We can all agree that 2020 was a year like no other: the world was marked by the SARS-CoV-2 pandemic. Millions of people died, tens of millions became sick, hundreds of millions were quarantined, and billions of people, including us, had their lives changed. The pandemic continues to develop, but there is light at the end of this dark tunnel: a vaccine.
Vaccines typically require years of research and testing before reaching the clinic, but in 2020, scientists embarked on a race to produce safe and effective coronavirus vaccines in record time - many thanks to them!
Today we will take a look at the COVID-19 World Vaccination Progress dataset and gain some insights into how vaccination is progressing.
library(data.table)
library(dplyr)
library(stringr)
library(ggplot2)
library(plotly)
library(glue)
library(sp)
library(maps)
library(maptools)
library(leaflet)

# read data
vaccine <- read.csv("country_vaccinations.csv")
head(vaccine,5)

The data contains the following information:
It’s good practice to clean our data before going any further with the analysis.
We need two data types here: a date type and a categorical type (datetime64 and category in pandas; Date and factor in R). Which columns must be converted?
Date: date
factor: country, vaccines, source_name

vaccine <- vaccine %>%
mutate(date = as.Date(date),
across(c(country,vaccines,source_name), as.factor))

Before going any deeper, let’s find out the time range covered by the dataset:

summary(vaccine['date'])
#> date
#> Min. :2020-12-13
#> 1st Qu.:2021-01-11
#> Median :2021-01-23
#> Mean :2021-01-21
#> 3rd Qu.:2021-02-03
#> Max. :2021-02-15
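As a quick cross-check, the length of this time range can be computed directly from the first and last dates. The sketch below hard-codes the minimum and maximum shown in the summary above rather than reading the CSV; on the real data you would pass vaccine$date to range() instead. The names date_range and span_days are only illustrative.

```r
# first and last dates, copied from the summary output above
date_range <- as.Date(c("2020-12-13", "2021-02-15"))

# number of days between the first and last observation
span_days <- as.numeric(diff(range(date_range)))
span_days
```

So the dataset covers a little over two months of vaccination records.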
As of February 11th, 2021, researchers are testing 69 vaccines in clinical trials on humans, and 20 have reached the final stages of testing. At least 89 preclinical vaccines are under active investigation in animals. Source: The New York Times
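Let’s find out which vaccines are used by the most countries. Since many countries use a combination of vaccines, we split each combination into individual vaccines before counting.

country_vaccine <- vaccine %>%
distinct(country, vaccines)

vec <- unlist(strsplit(as.character(country_vaccine$vaccines),", "))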
vaccine_count <- as.data.frame(vec) %>%
count(vec) %>%
arrange(desc(n))
vaccine_count

vaccine_count2 <- vaccine_count %>%
mutate(label = glue("Vaccine: {vec}
Count: {n}"))
fig <- ggplot(data=vaccine_count2, mapping=aes(x=n, y=reorder(vec,n), text=label)) +
geom_col(aes(fill=n), show.legend = FALSE) +
theme_minimal() +
labs(
title = "Number of Countries Using Each Vaccine",
y = "Vaccines",
x = NULL
) +
scale_fill_gradient(low = "seagreen1", high = "purple") +
theme(panel.background = element_rect(fill = "white"),
panel.grid = element_line(color = "snow2"))
ggplotly(fig,tooltip = "text")

Currently, Indonesia is using the Sinovac vaccine. The result above shows that 2 countries use Sinovac. Say we want to find out which country other than Indonesia uses it.
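One way to answer this is conditional subsetting with filter(). The sketch below is self-contained and uses a small made-up table: country_vaccine_toy, Country A, and Country B are placeholders, not real results. On the real data, the same filter could be applied to country_vaccine.

```r
library(dplyr)
library(stringr)

# toy stand-in for country_vaccine; rows are illustrative, not real data
country_vaccine_toy <- data.frame(
  country  = c("Indonesia", "Country A", "Country B"),
  vaccines = c("Sinovac", "Pfizer/BioNTech, Sinovac", "Moderna"),
  stringsAsFactors = FALSE
)

# keep rows whose vaccine list mentions Sinovac, excluding Indonesia
sinovac_others <- country_vaccine_toy %>%
  filter(str_detect(as.character(vaccines), fixed("Sinovac")),
         country != "Indonesia")

sinovac_others
```

The as.character() call is there because vaccines was converted to a factor earlier; matching on the substring rather than the whole value also catches countries that use Sinovac in combination with other vaccines.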
# additional: conditional subsetting

In this section, we would like to know which country is leading in vaccination.
First, let’s find the top 5 countries with the highest sum of daily_vaccinations:
top5_country <-
head(vaccine %>%
# drop England, which is presumably already counted under United Kingdom
filter(country != 'England') %>%
group_by(country) %>%
summarise(total = sum(daily_vaccinations, na.rm=TRUE)) %>%
arrange(desc(total)), 5)
top5_country

plotdata <- top5_country %>%
mutate(country = recode(country,
"United States" = "USA",
"United Kingdom" = "UK"))
plotdata

world <- map("world", fill=TRUE, plot=FALSE)
world_map <- map2SpatialPolygons(world, sub(":.*$", "", world$names))
world_map <- SpatialPolygonsDataFrame(world_map,
data.frame(country=names(world_map),
stringsAsFactors=FALSE),
FALSE)
target <- subset(world_map, country %in% plotdata$country)
target_sf <- sf::st_as_sf(target)
target_sf <- target_sf %>%
left_join(plotdata, by = c("country"))
pal <- colorNumeric(palette = "YlOrRd", domain = target_sf$total)
labels <- glue("<b>{target_sf$country}</b><br>
Total daily vaccinations: {prettyNum(target_sf$total, big.mark = ',')}") %>%
lapply(htmltools::HTML)
leaflet(target_sf) %>%
addTiles() %>%
addPolygons(weight=2,
label=labels,
fillColor = ~pal(total),
fillOpacity = .8,
color = "grey",
highlightOptions = highlightOptions(
weight = 2,
color = "black",
bringToFront = TRUE,
opacity = 0.8
))%>%
addLegend(
pal = pal,
values = ~total,
labels = ~total,
opacity = 1,
title = "Total vaccinations:",
position = "bottomright"
)